4 research outputs found

    Machine learning methods for detecting and correcting data errors in water level telemetry systems

    Water level data from telemetry stations can be used for early warning to prevent risk situations such as floods and droughts. However, the equipment in a telemetry station may fail, leading to errors in the data that result in false alarms or missed true alarms. Manually examining the data is time-consuming and requires expertise, so an automated system is required. Several algorithms are available for detecting and correcting anomalous data, but the question remains as to which algorithm is most suitable for telemetry data. To investigate and identify such an algorithm, statistical models, machine learning models, deep learning models, and reinforcement learning models were implemented and evaluated. For anomaly detection, we first evaluated statistical models using our modified sliding window algorithm, Only Normal Sliding Windows (ONSW), to assess their performance. We then proposed Deep Reinforcement Learning (DRL) models and compared them to deep learning models to determine their suitability for the task. Additionally, we developed a feature extraction approach that combines the saliency map and nearest neighbor extracted feature (SM+NNFE) to improve model performance. Various ensemble approaches were also implemented and compared to other competitive methods. For data imputation, we developed the Full Subsequence Matching (FSM) technique, which fills in missing values by imitating values from the most similar subsequence. Based on the results, machine learning models with ONSW are the best option for identifying abnormalities in telemetry water level data, and a deep reinforcement learning model could be used to identify abnormalities at crucial stations requiring further attention. Regarding data imputation, our technique outperforms other competitive approaches when dealing with water level data influenced by tides. However, relying on a single model or a limited number of models may be risky, as their performance could deteriorate in the future without being noticed. Therefore, building models using ensemble techniques is a viable option for reducing errors caused by this issue.
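    The ONSW algorithm is not spelled out in this abstract; reading the name literally (training windows are formed only from data known to be normal), a rough sketch might look like the following, where the window length, step size, and the Isolation Forest detector are purely illustrative assumptions rather than the configuration used in this work.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def only_normal_sliding_windows(values, labels, window=24, step=1):
    """Hedged sketch of ONSW: build fixed-length sliding windows and keep
    only those containing no known anomalies, so the detector is fitted on
    normal behaviour only. Window length and step are assumptions."""
    windows = []
    for start in range(0, len(values) - window + 1, step):
        if not labels[start:start + window].any():   # skip windows touching anomalies
            windows.append(values[start:start + window])
    return np.asarray(windows)

# Usage sketch with synthetic stand-in data (not real telemetry):
# values = hourly water levels, labels = True where a value is a known anomaly.
values = np.sin(np.linspace(0, 50, 1000)) + 0.05 * np.random.randn(1000)
labels = np.zeros(1000, dtype=bool)
labels[100:105] = True                                # a few marked anomalies
train = only_normal_sliding_windows(values, labels)
detector = IsolationForest(random_state=0).fit(train) # illustrative detector choice
scores = detector.decision_function(train[:10])       # lower scores = more anomalous
```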

    Novel methods for imputing missing values in water level monitoring data

    Hydrological data are collected automatically from remote water level monitoring stations and then transmitted to the national water management centre via a telemetry system. However, the data received at the centre can be incomplete or anomalous due to issues with the instruments, such as power and sensor failures. Usually, the detected anomalies or missing data are simply eliminated from the data, which could lead to inaccurate analysis or even false alarms. Therefore, it is very helpful to identify missing values and correct them as accurately as possible. In this paper, we introduce a new approach, Full Subsequence Matching (FSM), for imputing missing values in telemetry water level data. The FSM first identifies a sequence of missing values and replaces them with constant values to create a dummy complete sequence. It then searches for the most similar subsequence in the historical data. Finally, the identified subsequence is adapted to fit the missing part based on their similarity. The imputation accuracy of the FSM was evaluated with telemetry water level data and compared to some well-established methods (Interpolation, k-NN, and MissForest) and a leading deep learning method, the Long Short-Term Memory (LSTM) technique. Experimental results show that the FSM technique can produce more precise imputations, particularly for data with strong periodic patterns.
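    The FSM steps described above (dummy fill, subsequence search, adaptation) could be sketched roughly as below. This is a minimal illustration under assumptions: the context length, the use of the series mean as the dummy constant, Euclidean distance for matching, and a simple level offset for adaptation are all placeholders, not the published method.

```python
import numpy as np

def fsm_impute(series, gap_start, gap_len, history, context=24):
    """Hedged sketch of Full Subsequence Matching (FSM) imputation.

    series:  1-D array containing the gap (NaNs) to be filled
    history: 1-D array of complete historical values to search
    Assumes the gap is not at the boundary of the series; the dummy
    constant, distance measure, and adaptation rule are illustrative."""
    # 1) Build a query around the gap, replacing missing values with a constant
    lo, hi = gap_start - context, gap_start + gap_len + context
    query = series[lo:hi].copy()
    query[np.isnan(query)] = np.nanmean(series)          # assumed dummy constant

    # 2) Search the history for the most similar subsequence (Euclidean distance)
    win = len(query)
    best_dist, best_pos = np.inf, 0
    for pos in range(len(history) - win + 1):
        dist = np.linalg.norm(history[pos:pos + win] - query)
        if dist < best_dist:
            best_dist, best_pos = dist, pos

    # 3) Adapt the matched subsequence to the local level and copy it into the gap
    match = history[best_pos:best_pos + win]
    offset = np.nanmean(series[lo:hi]) - match.mean()     # simple level adjustment
    filled = series.copy()
    filled[gap_start:gap_start + gap_len] = match[context:context + gap_len] + offset
    return filled
```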

    Deep Reinforcement Learning Ensemble for Detecting Anomaly in Telemetry Water Level Data

    Water levels in rivers are measured by various devices installed mostly at remote locations along the rivers, and the collected data are then transmitted via telemetry systems to a data centre for further analysis and utilisation, including producing early warnings for risk situations. Data quality is therefore essential. However, the devices in a telemetry station may malfunction and cause errors in the data, which can result in false alarms or missed true alarms. Finding these errors requires experienced people with specialised knowledge, which is very time-consuming and also inconsistent. Thus, there is a need for an automated approach. In this paper, we first investigated the applicability of Deep Reinforcement Learning (DRL) models. The testing results show that whilst they are more accurate than some other machine learning models, particularly in identifying unknown anomalies, they lack consistency. Therefore, we propose an ensemble approach that combines DRL models to improve both consistency and accuracy. Compared with other models, including Multilayer Perceptrons (MLP) and Long Short-Term Memory (LSTM) networks, our ensemble models are not only more accurate in most cases but, more importantly, more reliable.
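    The abstract does not state how the individual DRL detectors' outputs are combined; one plausible combination rule is a simple majority vote over the models' anomaly flags, sketched below. The 0/1 prediction format and the voting threshold are assumptions, not necessarily the authors' scheme.

```python
import numpy as np

def ensemble_vote(predictions, threshold=0.5):
    """Hedged sketch of combining several anomaly detectors by majority vote.

    predictions: array of shape (n_models, n_points) with 0/1 anomaly flags,
    e.g. one row per trained DRL agent. The threshold is an assumption."""
    predictions = np.asarray(predictions)
    vote_fraction = predictions.mean(axis=0)          # share of models flagging each point
    return (vote_fraction >= threshold).astype(int)   # flag when enough models agree

# Usage sketch: three hypothetical detectors disagree on the third point;
# the ensemble keeps the majority decision, smoothing out an unstable model.
flags = ensemble_vote([[0, 1, 1, 0],
                       [0, 1, 0, 0],
                       [0, 1, 1, 0]])
print(flags)   # -> [0 1 1 0]
```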

    Developing Ensemble Methods for Detecting Anomalies in Water Level Data

    Telemetry is an automatic system for monitoring environments in remote or inaccessible areas and transmitting data via various media. Data from telemetry stations can be used to produce early warnings or decision support in risky situations. However, a device in a telemetry system may sometimes not work properly and generate errors in the data, which can lead to false alarms or missed true alarms for disasters. We therefore developed two types of ensembles, (1) simple and (2) complex, for automatically detecting anomalous data. The ensembles were tested on data collected from 9 telemetry water level stations, and the results clearly show that the complex ensembles are the most accurate and also the most reliable in detecting anomalies.
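    The abstract does not define what distinguishes the simple and complex ensembles; one common reading, used purely as an assumption here, is unweighted voting versus a stacked meta-learner. The synthetic scores, thresholds, and logistic-regression meta-model below are hypothetical stand-ins, not the methods evaluated in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hedged sketch: "simple" ensemble = unweighted majority vote over base detectors;
# "complex" ensemble = a meta-model (logistic regression, an assumption) that
# learns how much to trust each detector. All data below are synthetic stand-ins.
rng = np.random.default_rng(0)
base_scores = rng.random((3, 500))                    # anomaly scores from 3 base detectors
true_labels = (base_scores.mean(axis=0)
               + 0.1 * rng.standard_normal(500)) > 0.6

# Simple ensemble: threshold each detector, then take a majority vote
simple_flags = ((base_scores > 0.6).sum(axis=0) >= 2).astype(int)

# Complex ensemble: fit a meta-learner on the stacked base scores
# (fitting and predicting on the same data only to keep the sketch short)
meta = LogisticRegression().fit(base_scores.T, true_labels)
complex_flags = meta.predict(base_scores.T).astype(int)
```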